Research Project: Text Engineering Tool for Ontological Scientometry
Abstract
The number of scientific papers is growing exponentially in many disciplines, and the share of papers available online is growing as well. At the same time, the period after which a paper loses its chance of being cited is shortening. The decay of the citation rate resembles ultradiffusion processes, as observed for other online content in social networks. The distribution of papers per author resembles the distribution of posts per user in social networks. Among papers available online, the share that is never cited is growing, while some papers 'go viral' in terms of citations. In summary, the practice of scientific publishing is moving toward the domain of social networks. The goal of this project is to create a text engineering tool that can semi-automatically categorize papers according to their type of contribution and extract the relationships between them into an ontological database. Semi-automatic categorization means that mistakes made by the automatic pre-categorization and relationship extraction will be corrected by volunteers from the general public through a Wikipedia-like front end. This tool should not only help researchers and the general public find relevant supplementary material and peers faster, but also provide more information to research funding agencies.
Keywords: Scientometry, Bibliometry, Social Networks, Information Retrieval, Ontology, Semantic Technology, Formal Concept Analysis
Related resources
The Evaluation of ontological representations of the SWEBOK as a revision tool
The SWEBOK represents an important milestone in reaching a broad agreement on the contents of the Software Engineering discipline. Formal ontologies thus become a tool to represent such agreement in a logics-based framework for a number of applications. In this paper, the use of common ontological criteria in the project is described as a useful assessment tool. The use of such con...
Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents
Beyond its own merits, text classification (TC) has become a cornerstone of many applications. The work presented here is part of, and a prerequisite for, a project we have undertaken to create a corpus for Arabic text processing. It is an attempt to automatically create modules that would help speed up classification for any text categorization task. It also serves as a tool for...
Lexical Knowledge Engineering: MikroKosmos Revisited
This paper will describe the MikroKosmos methodology for the knowledge engineering of a computational lexicon used for text analysis. To do so, the paper outlines the general requirements for a knowledge base to be used for NLP, followed by specific requirements for building the lexical knowledge source. To highlight the issue of efficiency and reusability, the paper will contrast knowledge eng...
A Methodology of Engineering Ontology Development for Information Retrieval
When engineering content is created and applied during the product lifecycle, it is often stored and forgotten. Since search remains text-based, engineers do not have the means to harness and reuse past designs, experiences, and mistakes. On the other hand, current information retrieval approaches based on statistical methods and keyword matching are not directly applicable to the engineering d...
Ontology-Based Document Clustering with a Fuzzy Approach
Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from large amounts of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, the unsupervised classification of similar documents into different groups, is one of the important techniques of text mining. The most important step...
Journal title:
- CoRR
Volume: abs/1601.01887
Issue: -
Pages: -
Publication date: 2016